Open
Conversation
M-Elsaeed
approved these changes
Apr 22, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Add CloudWatch Monitoring for Integration Tests
This PR implements automated CloudWatch monitoring for the Ruby Runtime Interface Client integration tests to ensure continuous test health across all supported configurations.
What's Changed
Centralized Test Matrix Configuration
• Added
.github/test-matrix.jsonto define all test permutations in a single source of truth• Covers 16 combinations: 2 architectures (x64, arm64) × 4 distros (AL2023, Alpine, Debian, Ubuntu) × 2 Ruby versions (3.3, 3.4)
• Refactored workflows to share the same matrix logic, eliminating duplication
CloudWatch Alarm Infrastructure
• New workflow:
.github/workflows/bootstrap-alarms.yml• Bootstraps the Runtime AWS account with individual CloudWatch alarms for each test permutation (dynamically computed from the JSON matrix)
• Creates a composite aggregate alarm that triggers if ANY individual alarm fails or has insufficient data
• Alarms trigger if no successful test metric is received within 3 days (uses 1-day evaluation periods for faster state transitions)
• Idempotent operations: re-running won't destroy existing alarms
• Runs on pull requests and can be manually triggered
Enhanced Integration Tests
• Refactored .
github/workflows/integration-tests.ymlto use the shared test matrix• Added scheduled runs every workday (Mon-Fri at 08:00 UTC) to match our pipeline freshness policy
• Ensures the RIC works with newer versions of base OS on a daily basis
• Integrated AWS OIDC authentication for CloudWatch access
• Publishes success metrics to CloudWatch after each successful test run
• Metrics include dimensions: Distro, DistroVersion, RuntimeVersion, and Arch
• No-data scenarios trigger alarms, preventing silent failures from going unnoticed
Configuration
• Added required secrets:
AWS_ALARM_TARGET_ARN,AWS_OIDC_ROLE_ARN, andAWS_REGION• Current alarm action: SNS → email notifications to the team
• Future enhancement: Auto SIM ticket creation
Benefits
• Proactive monitoring: Get alerted when tests stop running or start failing consistently
• Centralized visibility: Single aggregate alarm for the entire test suite
• Daily validation: Continuous verification that RIC works with latest base OS versions
• Reduced duplication: Test matrix defined once and shared across workflows
• No silent failures: Missing metrics trigger alarms
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.